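Dynamical systems that minimize an energy are ubiquitous in geometry and physics. We propose a gradient flow framework for GNNs in which the equations follow the direction of steepest descent of a learnable energy. This approach allows the GNN evolution to be explained from a multi-particle perspective, as learning attractive and repulsive forces in feature space via the positive and negative eigenvalues of a symmetric "channel-mixing" matrix. We perform a spectral analysis of the solutions and conclude that gradient flow graph convolutional models can induce dynamics dominated by the high frequencies of the graph, which is desirable for heterophilic datasets. We also describe structural constraints on common GNN architectures that allow them to be interpreted as gradient flows. We perform thorough ablation studies that corroborate our theoretical analysis, and we show competitive performance of simple and lightweight models on real-world homophilic and heterophilic datasets.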
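To make the gradient flow idea concrete, the following is a minimal worked derivation under stated assumptions. The quadratic energy used here is an illustrative choice, not necessarily the energy parameterized in the paper; the symbols $F$ (node features), $\bar{A}$ (a symmetric normalized adjacency matrix), and $W$ (the symmetric channel-mixing matrix) are assumed notation.

```latex
% Gradient flow of an energy E: the energy never increases along solutions.
\dot{F}(t) = -\nabla_F E\bigl(F(t)\bigr)
\quad\Longrightarrow\quad
\frac{d}{dt} E\bigl(F(t)\bigr)
  = \bigl\langle \nabla_F E, \dot{F} \bigr\rangle
  = -\bigl\lVert \nabla_F E \bigr\rVert^2 \le 0 .

% Illustrative (assumed) energy with symmetric \bar{A} and W:
E(F) = -\tfrac{1}{2}\,\operatorname{tr}\bigl(F^\top \bar{A} F W\bigr),
\qquad
\nabla_F E = -\bar{A} F W
\quad\Longrightarrow\quad
\dot{F} = \bar{A} F W .

% Projecting onto an eigenvector w_r of W (W w_r = \mu_r w_r), with z_r = F w_r:
\dot{z}_r = \mu_r\, \bar{A}\, z_r .
```

Under these assumptions, a component of $z_r$ along an eigenvector of $\bar{A}$ with eigenvalue $\lambda$ scales as $e^{\mu_r \lambda t}$. A positive $\mu_r$ therefore favors the largest $\lambda$ (smooth, low-frequency signals), while a negative $\mu_r$ favors the most negative $\lambda$ (high-frequency signals), which is consistent with the attraction and repulsion reading given in the abstract.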
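Equivariance to symmetries has proven to be a powerful inductive bias in deep learning research. Recent work on mesh processing has concentrated on various kinds of natural symmetries, including translations, rotations, scaling, node permutations, and gauge transformations. To date, no existing architecture is equivariant to all of these transformations. In this paper, we present an attention-based architecture for mesh data that is equivariant to all of the transformations listed above. Our pipeline relies on the use of relative tangential features: a simple, effective, equivariance-friendly alternative for use as inputs. Experiments on the FAUST and TOSCA datasets confirm that our proposed architecture achieves improved performance on these benchmarks and is indeed equivariant, and therefore robust, to a wide variety of local and global transformations.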
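We propose Graph-Coupled Oscillator Networks (GraphCON), a novel framework for deep learning on graphs. It is based on discretizations of a second-order system of ordinary differential equations (ODEs) that models a network of nonlinear, controlled, and damped oscillators coupled via the adjacency structure of the underlying graph. The flexibility of our framework permits any basic GNN layer (e.g. convolutional or attentional) as the coupling function, from which a multi-layer deep neural network is built up via the dynamics of the proposed ODEs. We relate the oversmoothing problem commonly encountered in GNNs to the stability of steady states of the underlying ODE, and we show that zero-Dirichlet-energy steady states are not stable for our proposed ODEs. This demonstrates that the proposed framework mitigates the oversmoothing problem. Moreover, we prove that GraphCON mitigates the exploding and vanishing gradients problem, which facilitates the training of deep multi-layer GNNs. Finally, we show that our approach offers competitive performance with respect to the state of the art on a variety of graph-based learning tasks.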
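The idea of building layers from a discretized second-order oscillator system can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the coupling function `mean_neighbor_coupling`, the step size `dt`, the coefficients `gamma` and `alpha`, the zero initial velocity, and the particular update order are all hypothetical choices made for the example.

```python
import math

def mean_neighbor_coupling(x, adj):
    """Hypothetical coupling function: tanh of the mean of neighbor features."""
    out = []
    for i, row in enumerate(adj):
        nbrs = [j for j, a in enumerate(row) if a]
        dim = len(x[i])
        if nbrs:
            agg = [sum(x[j][d] for j in nbrs) / len(nbrs) for d in range(dim)]
        else:
            agg = [0.0] * dim
        out.append([math.tanh(v) for v in agg])
    return out

def oscillator_step(x, y, adj, dt=0.5, gamma=1.0, alpha=1.0):
    """One layer: update velocity y first, then position x with the new y.

    Assumed discretization of x'' = coupling(x) - gamma * x - alpha * x'.
    """
    f = mean_neighbor_coupling(x, adj)
    n, dim = len(x), len(x[0])
    y_new = [[y[i][d] + dt * (f[i][d] - gamma * x[i][d] - alpha * y[i][d])
              for d in range(dim)] for i in range(n)]
    x_new = [[x[i][d] + dt * y_new[i][d] for d in range(dim)] for i in range(n)]
    return x_new, y_new

def run_layers(x0, adj, num_layers=10, **kwargs):
    """Stack num_layers oscillator steps; zero initial velocity is an assumption."""
    x = [row[:] for row in x0]
    y = [[0.0] * len(row) for row in x0]
    for _ in range(num_layers):
        x, y = oscillator_step(x, y, adj, **kwargs)
    return x, y

# Path graph on three nodes with one feature per node.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
x0 = [[1.0], [0.0], [-1.0]]
x_final, y_final = run_layers(x0, adj, num_layers=10)
print(x_final)
```

Each call to `oscillator_step` plays the role of one network layer, so depth corresponds to the number of time steps of the discretized dynamics.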
Designing experiments often requires balancing between learning about the true treatment effects and earning from allocating more samples to the superior treatment. While optimal algorithms for the Multi-Armed Bandit Problem (MABP) provide allocation policies that optimally balance learning and earning, they tend to be computationally expensive. The Gittins Index (GI) is a solution to the MABP that can attain optimality and computational efficiency simultaneously, and it has recently been used in experiments with Bernoulli and Gaussian rewards. For the first time, we present a modification of the GI rule that can be used in experiments with exponentially distributed rewards. We report its performance in simulated 2-armed and 3-armed experiments. Compared to traditional non-adaptive designs, our novel GI-modified design shows operating characteristics that are comparable in learning (e.g. statistical power) but substantially better in earning (e.g. direct benefits). This illustrates the potential of designs that use a GI approach to allocate participants to improve participant benefits, increase efficiency, and reduce experimental costs in adaptive multi-armed experiments with exponential rewards.
We introduce Argoverse 2 (AV2) - a collection of three datasets for perception and forecasting research in the self-driving domain. The annotated Sensor Dataset contains 1,000 sequences of multimodal data, encompassing high-resolution imagery from seven ring cameras, and two stereo cameras in addition to lidar point clouds, and 6-DOF map-aligned pose. Sequences contain 3D cuboid annotations for 26 object categories, all of which are sufficiently-sampled to support training and evaluation of 3D perception models. The Lidar Dataset contains 20,000 sequences of unlabeled lidar point clouds and map-aligned pose. This dataset is the largest ever collection of lidar sensor data and supports self-supervised learning and the emerging task of point cloud forecasting. Finally, the Motion Forecasting Dataset contains 250,000 scenarios mined for interesting and challenging interactions between the autonomous vehicle and other actors in each local scene. Models are tasked with the prediction of future motion for "scored actors" in each scenario and are provided with track histories that capture object location, heading, velocity, and category. In all three datasets, each scenario contains its own HD Map with 3D lane and crosswalk geometry - sourced from data captured in six distinct cities. We believe these datasets will support new and existing machine learning research problems in ways that existing datasets do not. All datasets are released under the CC BY-NC-SA 4.0 license.
Extracting complex structures from grid-based data is a common key step in automated medical image analysis. The conventional solution to recovering tree-structured geometries typically involves computing the minimal cost path through intermediate representations derived from segmentation masks. However, this methodology has significant limitations in the context of projective imaging of tree-structured 3D anatomical data such as coronary arteries, since there are often overlapping branches in the 2D projection. In this work, we propose a novel approach to predicting tree connectivity structure which reformulates the task as an optimization problem over individual steps of a recursive process. We design and train a two-stage model which leverages the UNet and Transformer architectures and introduces an image-based prompting technique. Our proposed method achieves compelling results on a pair of synthetic datasets, and outperforms a shortest-path baseline.
Cashews are grown by over 3 million smallholders in more than 40 countries worldwide as a principal source of income. As the third largest cashew producer in Africa, Benin has nearly 200,000 smallholder cashew growers contributing 15% of the country's national export earnings. However, a lack of information on where and how cashew trees grow across the country hinders decision-making that could support increased cashew production and poverty alleviation. By leveraging 2.4-m Planet Basemaps and 0.5-m aerial imagery, newly developed deep learning algorithms, and large-scale ground truth datasets, we successfully produced the first national map of cashew in Benin and characterized the expansion of cashew plantations between 2015 and 2021. In particular, we developed a SpatioTemporal Classification with Attention (STCA) model to map the distribution of cashew plantations, which can fully capture texture information from discriminative time steps during a growing season. We further developed a Clustering Augmented Self-supervised Temporal Classification (CASTC) model to distinguish high-density versus low-density cashew plantations by automatic feature extraction and optimized clustering. Results show that the STCA model has an overall accuracy of 80% and the CASTC model achieved an overall accuracy of 77.9%. We found that the cashew area in Benin has doubled from 2015 to 2021 with 60% of new plantation development coming from cropland or fallow land, while encroachment of cashew plantations into protected areas has increased by 70%. Only half of cashew plantations were high-density in 2021, suggesting high potential for intensification. Our study illustrates the power of combining high-resolution remote sensing imagery and state-of-the-art deep learning algorithms to better understand tree crops in the heterogeneous smallholder landscape.
Grasping is a remarkable ability that animals exercise with their arms and limbs in daily life. The human hand is an especially astonishing multi-fingered tool for precise grasping, which has helped humans develop the modern world. Transferring the human grasp to virtual reality and telerobotics is both interesting and challenging. In this work, the authors survey, study, and analyze human hand-grasping behavior with a view to haptic grasping in virtual and remote environments. The work focuses on the motion and force analysis of the fingers in human hand-grasping scenarios, and the paper describes the transition from the human hand grasp to a tripod haptic grasp model for effective interaction in virtual reality.
Multivariate time series forecasting with hierarchical structure is pervasive in real-world applications. It demands not only predicting each level of the hierarchy, but also reconciling all forecasts to ensure coherency, i.e., the forecasts should satisfy the hierarchical aggregation constraints. Moreover, the disparities in statistical characteristics between levels can be huge, and they are worsened by non-Gaussian distributions and non-linear correlations. To this end, we propose a novel end-to-end hierarchical time series forecasting model, based on conditioned normalizing flow-based autoregressive transformer reconciliation, to represent complex data distributions while simultaneously reconciling the forecasts to ensure coherency. Unlike other state-of-the-art methods, we achieve forecasting and reconciliation simultaneously, without requiring any explicit post-processing step. In addition, by harnessing the power of deep models, we do not rely on any assumption such as unbiased estimates or Gaussian distributions. Our evaluation experiments are conducted on four real-world hierarchical datasets from different industrial domains (three public ones and a dataset from the application servers of Alipay's data center), and the preliminary results demonstrate the efficacy of our proposed method.
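The coherency requirement can be illustrated with a small sketch. It uses simple bottom-up aggregation with a summing matrix, which is a classical baseline and not the flow-based reconciliation proposed in the abstract; the hierarchy, the matrix `S`, and the forecast values are hypothetical.

```python
def aggregate(S, bottom):
    """Map bottom-level values to every level of the hierarchy: y = S @ bottom."""
    return [sum(s * b for s, b in zip(row, bottom)) for row in S]

def is_coherent(S, y, tol=1e-9):
    """Check the aggregation constraints, assuming the last rows of y are the bottom level."""
    n_bottom = len(S[0])
    expected = aggregate(S, y[-n_bottom:])
    return all(abs(a - b) <= tol for a, b in zip(y, expected))

# Hypothetical hierarchy: total = A + B, A = a1 + a2, B = b1.
# Rows: total, A, B, a1, a2, b1. Columns: a1, a2, b1.
S = [
    [1, 1, 1],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]

# Independent per-series forecasts generally violate the constraints.
base = [10.0, 6.0, 3.0, 4.0, 3.0, 3.5]
print(is_coherent(S, base))

# Bottom-up reconciliation: keep the bottom forecasts and re-aggregate.
reconciled = aggregate(S, base[-3:])
print(reconciled, is_coherent(S, reconciled))
```

Bottom-up aggregation enforces coherency as a separate post-processing step, which is the kind of step the abstract says its end-to-end model avoids.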
Knowledge distillation (KD) has gained a lot of attention in the field of model compression for edge devices thanks to its effectiveness in compressing large, powerful networks into smaller, lower-capacity models. Online distillation, in which the teacher and the student learn collaboratively, has also gained much interest due to its ability to improve the performance of the networks involved. The Kullback-Leibler (KL) divergence ensures proper knowledge transfer between the teacher and the student. However, most online KD techniques encounter bottlenecks when there is a capacity gap between the networks. When the models are trained cooperatively and simultaneously, the KL distance becomes incapable of properly aligning the teacher's and student's distributions. Alongside accuracy, critical edge device applications need well-calibrated compact networks, and confidence calibration provides a sensible way of obtaining trustworthy predictions. We propose BD-KD: Balancing of Divergences for online Knowledge Distillation. We show that adaptively balancing between the reverse and forward divergences shifts the focus of the training strategy to the compact student network without limiting the teacher network's learning process. We demonstrate that, by performing this balancing at the level of the student distillation loss, we improve both the accuracy and the calibration of the compact student network. We conducted extensive experiments using a variety of network architectures and show improvements on multiple datasets, including CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet. We illustrate the effectiveness of our approach through comprehensive comparisons and ablations against current state-of-the-art online and offline KD techniques.
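The distinction between forward and reverse divergences can be sketched as follows. This is a minimal illustration, assuming the common convention that the forward divergence is KL(teacher || student) and the reverse divergence is KL(student || teacher). The fixed weights `w_forward` and `w_reverse` are hypothetical placeholders; the adaptive balancing rule of BD-KD is not specified in the abstract and is not reproduced here.

```python
import math

def kl(p, q, eps=1e-12):
    """KL(p || q) for two discrete probability vectors of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)

def balanced_divergence(teacher, student, w_forward, w_reverse):
    """Weighted sum of forward KL(teacher || student) and reverse KL(student || teacher).

    The weights are hypothetical constants chosen by the caller.
    """
    forward = kl(teacher, student)
    reverse = kl(student, teacher)
    return w_forward * forward + w_reverse * reverse

teacher_probs = [0.7, 0.2, 0.1]
student_probs = [0.4, 0.4, 0.2]

forward = kl(teacher_probs, student_probs)
reverse = kl(student_probs, teacher_probs)
print(forward, reverse)  # the two directions generally differ
print(balanced_divergence(teacher_probs, student_probs, 0.5, 0.5))
```

Because KL divergence is asymmetric, the two directions penalize mismatches differently, which is what makes the relative weighting a meaningful design choice.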